Controlled and Balanced Dataset for Japanese Lexical Simplification

نویسندگان

  • Tomonori Kodaira
  • Tomoyuki Kajiwara
  • Mamoru Komachi
چکیده

We propose a new dataset for evaluating a Japanese lexical simplification method. Previous datasets have several deficiencies. All of them substitute only a single target word, and some of them extract sentences only from newswire corpus. In addition, most of these datasets do not allow ties and integrate simplification ranking from all the annotators without considering the quality. In contrast, our dataset has the following advantages: (1) it is the first controlled and balanced dataset for Japanese lexical simplification with high correlation with human judgment and (2) the consistency of the simplification ranking is improved by allowing candidates to have ties and by considering the reliability of annotators.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evaluation Dataset and System for Japanese Lexical Simplification

We have constructed two research resources of Japanese lexical simplification. One is a simplification system that supports reading comprehension of a wide range of readers, including children and language learners. The other is a dataset for evaluation that enables open discussions with other systems. Both the system and the dataset are made available providing the first such resources for the...

متن کامل

The Effect of Reducing Lexical and Syntactic Complexity of Texts on Reading Comprehension

The present study investigated the effect of different types of text simplification (i.e., reducing the lexical and syntactic complexity of texts) on reading comprehension of English as a Foreign Language learners (EFL). Sixty female intermediate EFL learners from three intact classes in Tabarestan Language Institute in Tehran participated in the study. The intact classes were assigned to three...

متن کامل

Japanese Lexical Simplification for Non-Native Speakers

This paper introduces Japanese lexical simplification. Japanese lexical simplification is the task of replacing complex words in a given sentence with simple words to produce a new sentence without changing the original meaning of the sentence. We propose a method of supervised regression learning to estimate complexity ordering of words with statistical features obtained from two types of Japa...

متن کامل

A Dataset for the Evaluation of Lexical Simplification

Lexical Simplification is the task of replacing individual words of a text with words that are easier to understand, so that the text as a whole becomes easier to comprehend, e.g. by people with learning disabilities or by children who learn to read. Although this seems like a straightforward task, evaluating algorithms for this task is not so. The problem is how to build a dataset that provide...

متن کامل

Benchmarking Lexical Simplification Systems

Lexical Simplification is the task of replacing complex words in a text with simpler alternatives. A variety of strategies have been devised for this challenge, yet there has been little effort in comparing their performance. In this contribution, we present a benchmarking of several Lexical Simplification systems. By combining resources created in previous work with automatic spelling and infl...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016